Knowledge domains and emerging trends of Genome-wide association studies in Alzheimer’s disease: A bibliometric analysis and visualization study from 2002 to 2022

Objectives Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by a progressive decline in cognitive and behavioral function. Studies have shown that genetic factors are among the main contributors to AD risk. Genome-wide association study (GWAS), an effective tool for studying the genetic risk of diseases, has attracted considerable attention from researchers in recent years, and a large number of studies have been conducted. This study aims to summarize the literature on GWAS in AD using bibliometric methods and to analyze the current status, research hotspots and future trends of the field. Methods We retrieved articles on GWAS in AD published between 2002 and 2022 from Web of Science. CiteSpace and VOSviewer were used to analyze the number of publications, the countries/regions and institutions of publication, authors and co-cited authors, highly cited literature, and research hotspots. Results We retrieved a total of 2,751 articles. The United States had the highest number of publications in this field, and Columbia University was the institution with the most published articles. The identification of AD-related susceptibility genes and their effects on AD is one of the current research hotspots. Numerous risk genes have been identified, among which APOE, CLU, CD2AP, CD33, EPHA1, PICALM, CR1, ABCA7 and TREM2 are the current genes of interest. In addition, risk prediction for AD and research on other related diseases are also popular research directions in this field. Conclusion This study conducted a comprehensive analysis of GWAS in AD and identified the current research hotspots and trends. We also pointed out the shortcomings of current research and suggested future research directions. This study can provide researchers with information about the knowledge structure and emerging trends in the field of GWAS in AD and provide guidance for future research.


Introduction
Alzheimer's disease (AD) is a degenerative neurological condition characterized by a progressive decline in cognitive and behavioral functions, ultimately resulting in mortality [1]. As the most prevalent type of dementia, AD accounts for around 60-80% of all dementia cases and has emerged as one of the most costly, deadly, and burdensome diseases of this century [2,3]. More than 100 years have passed since AD was first reported by Dr. Alois Alzheimer in 1906, yet the disease remains poorly understood. It is therefore important to explore the underlying pathogenesis and to identify causative and protective genes in order to better understand and treat AD. Twin studies have demonstrated that genetic factors account for a major share of AD risk [4]. Since the 1980s, APP, PSEN1 and PSEN2 have been identified as causes of early-onset AD (EOAD) through systematic linkage analysis [5]. However, EOAD accounts for only about 10% of all AD cases, and late-onset AD (LOAD) is the most common form. Studies have shown that genes play an important role in the etiology of LOAD [6]. In 1993, the ε4 allele of apolipoprotein E (APOE) was found to be strongly associated with LOAD [7]. This groundbreaking discovery stimulated interest in genetic studies of AD. It was not until the advent of high-throughput genomic approaches, particularly genome-wide association studies (GWAS), that other risk factors associated with AD were gradually identified. GWAS, an emerging tool for identifying genetic risk factors associated with complex diseases, can identify disease-associated single nucleotide polymorphisms (SNPs) more precisely by analyzing large amounts of genetic data [1].
Prior to the advent of GWAS, AD was usually studied using traditional methods such as family-based studies, candidate gene studies, cellular and animal experiments, epidemiological studies, and neuroimaging studies. Although these methods can identify AD risk genes and pathogenic mechanisms to some extent, they are usually limited by sample size, complex experimental design, and individual bias. GWAS enables the simultaneous assessment of the relationship between millions of genetic variants and AD, overcoming the sample-size limitations inherent to traditional methods and helping to minimize false-positive results. Moreover, GWAS does not depend on specific biological samples or experimental conditions, so it can reduce experimental bias and improve the accuracy of results. Particularly in the context of AD, GWAS excels at elucidating the disease's polygenic effects and has facilitated the exploration of specific proteins and their roles in AD, providing insights into the mechanisms underlying disease progression. The public availability of most GWAS-related databases and studies also fosters further research and collaboration in the field. As an effective method for genetic studies, GWAS has been widely used in the study of AD. In 2009, the first two large-scale GWAS of AD were published in succession, reporting associations of CLU, CR1, and PICALM with AD risk [8,9]. This achievement became a milestone in the field. Since then, a large number of GWAS of AD have been carried out and many novel risk loci have been identified, such as BIN1, TREM2, CD33, SORL1, CD2AP, ABCA7 and EPHA1 [10]. These loci have also been shown to be involved in multiple biological pathways in vivo, including immune function, amyloid and tau protein processing, and lipid metabolism [8,11-13]. GWAS of AD thus continue to reveal genetic factors associated with the disease. Furthermore, analyzing the biological functions of these genes can enhance our understanding of AD pathology and ultimately provide better guidance for AD prevention and treatment.
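To make the idea of testing many variants against AD while controlling false positives concrete, the following is a minimal sketch of a single-SNP allelic chi-square test, assuming a simple 2x2 table of allele counts in cases and controls. The function name, the allele counts, and the SNP labels are hypothetical and purely illustrative; the threshold of 5e-8 is the conventional genome-wide significance level, and real GWAS pipelines typically use regression models with covariates rather than this basic test.

```python
import math

# Conventional genome-wide significance threshold (roughly a Bonferroni
# correction for about one million independent tests). Assumed here.
GENOME_WIDE_ALPHA = 5e-8


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Return (chi2, p_value, odds_ratio) for a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts (alt/ref in cases, alt/ref in controls).
snps = {
    "snp_A": (3000, 7000, 2200, 7800),
    "snp_B": (2510, 7490, 2500, 7500),
}
for name, counts in snps.items():
    chi2, p_value, odds_ratio = allelic_chi_square(*counts)
    print(name, round(chi2, 2), p_value, round(odds_ratio, 3),
          p_value < GENOME_WIDE_ALPHA)
```

In this toy input, only the first SNP passes the genome-wide threshold, which illustrates why a stringent cutoff is needed when millions of variants are tested at once.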
At present, a large number of studies on GWAS in AD have been conducted, generating a substantial body of literature, yet no study has systematically summarized them. Bibliometrics is the quantitative study of academic publications; it assesses the basic characteristics and research hotspots of a field by analyzing publication metadata (including year of publication, journal, author, country, keywords, etc.) [14]. Bibliometric and visualization methods have been widely used in medical research. They make it possible to analyze development trends for a particular disease, as well as for specific treatments or research methods. CiteSpace and VOSviewer [15,16] are two bibliometric software tools that perform statistical analyses and generate visual knowledge maps from publication data. This study uses CiteSpace and VOSviewer to conduct a comprehensive analysis of GWAS in AD, in order to summarize the development and evolution of the field and to predict future research trends. We anticipate that this study will help researchers gain a quick and comprehensive understanding of the field and provide ideas and directions for further research.

Inclusion and exclusion criteria
Reviews and research articles on GWAS in AD were included; conference papers and articles that were not formally published were excluded. The language was limited to English. In addition, the included articles were screened to remove duplicates.

Bibliometric and visual analysis
The retrieved articles were exported from Web of Science (WOS) with full records and cited references. The files were exported in plain text format and named "download_X.txt". CiteSpace 6.1.R6 was used for visual analysis of author and co-cited author networks, co-cited references and research hotspots; VOSviewer 1.6.18 was used for visual analysis of collaboration networks among countries/regions and institutions.
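The screening and export steps can be illustrated with a short script. The sketch below assumes the standard WOS tagged plain-text layout (two-character field tags such as PT, AU, LA, DT, PY and UT, indented continuation lines, "ER" closing each record and "EF" closing the file). The function names and the sample records are hypothetical, and this is not the procedure CiteSpace or VOSviewer applies internally.

```python
from collections import Counter

ALLOWED_TYPES = {"Article", "Review"}
EXCLUDED_TYPES = {"Proceedings Paper", "Meeting Abstract"}


def parse_wos_plaintext(text):
    """Parse WOS tagged plain text into a list of {tag: [values]} records."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        head = line[:2]
        if head == "ER":        # end of one record
            records.append(current)
            current, tag = {}, None
        elif head == "EF":      # end of file
            break
        elif head == "  ":      # continuation of the previous field
            if tag is not None:
                current[tag].append(line.strip())
        else:                   # a new two-character field tag
            tag = head
            current.setdefault(tag, []).append(line[3:].strip())
    return records


def select_records(records):
    """Keep English articles and reviews; drop duplicate accession numbers."""
    seen, kept = set(), []
    for rec in records:
        doc_types = {t.strip() for t in "; ".join(rec.get("DT", [])).split(";")}
        language = " ".join(rec.get("LA", []))
        uid = " ".join(rec.get("UT", []))
        if language != "English":
            continue
        if not doc_types & ALLOWED_TYPES or doc_types & EXCLUDED_TYPES:
            continue
        if uid and uid in seen:
            continue
        seen.add(uid)
        kept.append(rec)
    return kept


# Hypothetical records: one article, one meeting abstract, one duplicate.
sample = "\n".join([
    "FN Clarivate Analytics Web of Science", "VR 1.0",
    "PT J", "AU Doe, J", "   Roe, R", "TI Hypothetical GWAS of AD",
    "LA English", "DT Article", "PY 2021", "UT WOS:000000000000001", "ER", "",
    "PT J", "AU Poe, E", "TI Hypothetical conference abstract",
    "LA English", "DT Meeting Abstract", "PY 2021",
    "UT WOS:000000000000002", "ER", "",
    "PT J", "AU Doe, J", "   Roe, R", "TI Hypothetical GWAS of AD",
    "LA English", "DT Article", "PY 2021", "UT WOS:000000000000001", "ER", "",
    "EF",
])
records = parse_wos_plaintext(sample)
kept = select_records(records)
per_year = Counter(rec["PY"][0] for rec in kept)
print(len(records), len(kept), dict(per_year))
```

Counting the retained records by the PY field, as in the last lines, gives the kind of yearly publication tally used for trend analysis.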

Trends in the number of publications
A total of 2,752 articles published from 2002 to 2022 were retrieved (Fig 1), of which 2,217 were research articles and 454 were reviews. The number of articles published on GWAS in AD has increased over the last 20 years. Publication output grew rapidly from 2009 to 2012 and peaked in 2021 at 313 articles. The number of articles published in 2022 is slightly lower than in 2021, possibly because 2022 had not yet ended when this study was conducted and some articles could not be included.

Countries (regions) and institutions
From 2002 to 2022, studies on GWAS in AD were published by 85 countries and regions (Fig 2A). Table 1 lists the top 10 countries and regions by number of publications. The United States published the most studies in this field, with 1,459 articles over the 20-year period, accounting for 53% of all publications.
Institutional collaborations (Fig 2B) involved a total of 2,950 institutions, mainly universities and hospitals. Table 2 lists the top 10 institutions by publication count, including universities and clinical institutions in the United States, the United Kingdom, and China. Among them, institutions from the United States had the highest number of published articles, accounting for 80%, and Columbia University published the most articles. In the visualization map, different colors represent different clusters, and lines represent inter-institutional collaboration. The map indicates close inter-institutional academic cooperation in this field, most of which is domestic. International cooperation should be strengthened in the future.

Analysis of authors and co-cited authors
Research on GWAS in AD was analyzed using the author collaboration network (Fig 3A) and the co-cited author network (Fig 3B). Table 3 lists the top ten authors by number of publications and by citations. The three authors with the most publications are Bennett, Tan and Yu, with 93, 91 and 90 publications, respectively. Lambert JC was the most cited author with 1,463 citations, followed by Harold D with 990 citations. The remaining highly cited authors differed little in their citation counts.

PLOS ONE

Co-cited reference analysis
The number of citations an article receives reflects the degree of attention paid to its topic; a higher citation count suggests a more influential publication. We used CiteSpace to visually analyze the references of the articles (Fig 4). The time slice was set to two years, and the top 30 most cited items were selected from each slice. Each node is labeled with the author and publication year of a reference; node size indicates citation frequency, node color indicates the year, a link between two nodes indicates that the two references were cited together, and link thickness indicates the strength of that relationship. Table 4 lists the top 10 most cited articles. These articles are concentrated in three journals, Nature Genetics, JAMA and New England Journal of Medicine, with seven of them published in Nature Genetics. All 10 are research articles that compared case and control groups through GWAS to find risk loci associated with AD. Harold D (2009) was the most frequently cited article; it identified the association of CLU and PICALM with AD by GWAS. In addition, Lambert JC (2013), Naj AC (2011), Hollingworth P (2011), Kunkle BW (2019), Jansen IE (2019), Seshadri S (2010) and Guerreiro R (2013) also combined GWAS with meta-analysis.

Analysis of research hotspots
As labels for an article, keywords help readers quickly grasp its topic and content. Analyzing keywords can therefore give a better understanding of the research hotspots and frontiers in a field [17]. In this study, a co-occurrence analysis and a cluster analysis (Fig 5) were performed on the article keywords using VOSviewer. After excluding the subject terms "genome-wide association studies" and "Alzheimer's disease", we summarized the 10 most frequent keywords. As shown in Table 5, "identifies variants" was the most frequent keyword, with 637 occurrences. Identifying variants is one of the core steps of GWAS and refers to the process of discovering SNPs associated with specific traits during the analysis of large amounts of genotype data. Other common keywords include risk, common variant, expression and loci, which are common steps and key concepts in the GWAS process. The APOE gene is located on chromosome 19 and has three alleles, APOE ε2, APOE ε3 and APOE ε4 [18]. Several studies [19-22] have demonstrated that APOE is the strongest genetic risk factor for LOAD, and it is a hot topic of current research, with a total of 435 occurrences in the keyword co-occurrence statistics. Meta-analysis is a research method often applied to GWAS: by pooling multiple independent GWAS results, it improves the reliability of the results and better summarizes the role of genetic variation in AD.
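The counting behind a keyword co-occurrence analysis can be sketched in a few lines. The example below assumes each article is represented by a list of keywords; the function name and the three toy keyword lists are hypothetical. It reproduces only the occurrence and pairwise co-occurrence counts, not VOSviewer's normalization, clustering or layout.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists, excluded=()):
    """Count keyword occurrences and pairwise co-occurrences across articles."""
    excluded = {k.lower() for k in excluded}
    occurrences, links = Counter(), Counter()
    for keywords in keyword_lists:
        # Normalize case and count each keyword at most once per article.
        unique = sorted({k.strip().lower() for k in keywords} - excluded)
        occurrences.update(unique)
        # Every unordered pair within one article is one co-occurrence link.
        links.update(combinations(unique, 2))
    return occurrences, links


# Hypothetical keyword lists for three articles.
articles = [
    ["Alzheimer's disease", "genome-wide association studies", "APOE", "risk"],
    ["APOE", "risk", "meta-analysis"],
    ["Alzheimer's disease", "APOE", "tau"],
]
occurrences, links = keyword_cooccurrence(
    articles,
    excluded=["Alzheimer's disease", "genome-wide association studies"],
)
print(occurrences.most_common(3))
print(links[("apoe", "risk")])
```

Excluding the search terms themselves, as done here through the `excluded` argument, mirrors the removal of the subject terms before ranking keywords.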
The cluster analysis grouped these keywords into six categories. Table 6 summarizes the clusters and lists the keywords in each by frequency of occurrence. The first category is mainly concerned with the diagnosis, biomarkers and molecular mechanisms of AD. Phosphorylated tau protein and the inflammatory response are both typical pathological features of AD and the basis of pathogenic hypotheses, and they are hot topics of current research in AD pathology [23-25]. Gene expression variants are closely associated with the pathogenesis and progression of AD; they can be identified by GWAS, and corresponding diagnostic and therapeutic targets can be investigated based on the results. Biomarkers have great potential for the diagnosis and treatment of AD, and protein and gene expression variants associated with abnormal tau deposition and the inflammatory response are currently common directions for biomarker studies [26-30]. The second category is mainly related to genetic factors of AD and related research. It covers identifying susceptibility loci for AD and analyzing the effect of APOE on AD through GWAS. Cerebrospinal fluid contains a large number of molecules related to the nervous system and is often used as a research object in GWAS to find genetic risk factors related to AD [31,32]. Aβ, produced when amyloid precursor protein (APP) is abnormally cleaved by β-secretase and γ-secretase, is one of the common pathological features and markers of AD [33]. Excessive accumulation of Aβ leads to cognitive impairment, and the amyloid hypothesis is currently one of the most widely supported hypotheses for the pathogenesis of AD [24,34]. The third category is mainly related to GWAS methodology, including meta-analysis and Mendelian randomization (MR), which are common research methods in GWAS. MR is a method that uses genetic variation as an instrument for causal inference [35]. As a valuable tool in the field of GWAS, MR is often used to assess the causal relationship between risk factors and certain manifestations of disease [36]. The fourth category is similar to the third and is related to GWAS in AD, with keywords including risk, susceptibility, gene, onset and mutations. The fifth category is mainly related to potential risk factors and genetic variants of AD. Most of its keywords are specific genes associated with AD risk, including CLU, CD2AP, CD33, EPHA1, PICALM, CR1, ABCA7, and SORL1. The sixth category has fewer keywords and more scattered topics, including risk loci, national institute, diagnostic guidelines and innate immunity.
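As an illustration of how MR turns GWAS summary statistics into a causal estimate, the sketch below computes per-variant Wald ratios and a fixed-effect inverse-variance-weighted (IVW) estimate. It assumes independent instruments and uses first-order weights that ignore uncertainty in the exposure effects; the function name and all effect sizes are hypothetical, and the example is not drawn from the studies cited above.

```python
import math


def ivw_mr(instruments):
    """Fixed-effect IVW estimate from (beta_exposure, beta_outcome, se_outcome)."""
    numerator = denominator = 0.0
    for beta_x, beta_y, se_y in instruments:
        weight = 1.0 / se_y ** 2
        numerator += weight * beta_x * beta_y
        denominator += weight * beta_x ** 2
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    z_score = estimate / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return estimate, std_error, p_value


# Hypothetical summary statistics for three independent SNP instruments:
# (SNP effect on exposure, SNP effect on outcome, SE of outcome effect).
instruments = [
    (0.10, 0.020, 0.010),
    (0.20, 0.050, 0.010),
    (0.15, 0.030, 0.015),
]
# Wald ratio: the causal effect implied by each instrument on its own.
wald_ratios = [beta_y / beta_x for beta_x, beta_y, _ in instruments]
estimate, std_error, p_value = ivw_mr(instruments)
print(wald_ratios)
print(estimate, std_error, p_value)
```

The IVW estimate is a precision-weighted combination of the individual Wald ratios, so it falls between the smallest and largest ratio.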
As this research field develops, its hotspots change over time, and keyword clustering alone cannot reflect how keywords change over time. A keyword burst is a sudden increase in the frequency of a keyword during a certain period; analyzing keyword bursts helps to understand the development trend of GWAS in AD research and to identify research hotspots. In this study, we used CiteSpace for keyword burst analysis, and the top 20 burst keywords are listed in Fig 6. Missense mutation had the longest burst, lasting almost 10 years. We also found that the burst keywords before 2010 mainly concerned the basic concepts of GWAS and the common pathological features and genetic risk genes of AD. A likely explanation is that the field of GWAS in AD was just emerging, so researchers focused mainly on basic concepts. After 2010, researchers began to conduct a large number of GWAS-related studies of AD, and new AD susceptibility genes began to appear among the burst keywords, indicating that research attention had shifted to the exploration of new risk genes. Mendelian randomization, tau, meta-analysis, Aβ, cognitive, and risk factor are the keywords whose bursts persist to date and represent current hot directions. Of these, Mendelian randomization and tau are the most recent and became a focus of attention as soon as they appeared.

Discussion
This study analyzed GWAS in AD using bibliometric methods. Article information was visualized and mapped using CiteSpace and VOSviewer, and a comprehensive analysis was performed based on the results. Between 2002 and 2022, a total of 2,752 articles were retrieved, of which 2,217 were research articles, accounting for 81% of the total, and 454 were reviews. Many researchers are currently conducting related studies; however, there are fewer review articles, and a systematic summary of the field is lacking. This may be because GWAS in AD is a relatively new field in which research is ongoing and new findings are emerging in large numbers, so researchers are more inclined to write original research papers presenting the latest findings. Over the past 20 years, the number of publications has shown a consistent upward trend. The concept of GWAS can be traced back to 1996 [37], but it was not until a GWAS research article was published in Nature in 2005 [38] that it began to receive widespread attention and application. Therefore, only a few early articles mentioned GWAS in relation to AD, and the number of related studies started to increase after 2005. The period from 2009 to 2012 was one of rapid growth, reflecting growing attention to this field, and since 2014 growth has been steady. Genetic research on AD has always been an area of great interest. Previously, genetic research on AD was limited by sample size, the frequency of specific variants and other factors, whereas GWAS can analyze SNPs at millions of loci, allowing higher-resolution screening of the human genome for variants associated with common diseases and disorders [39]. In addition, as an unbiased research method, GWAS can exclude researchers' subjective bias and discover previously unknown gene-disease associations. GWAS has therefore attracted the attention of scholars in the AD field, a large number of related studies have been carried out, and GWAS in AD has gradually become a popular research field. Studies on GWAS in AD have been published by 85 countries and regions, with the United States having the highest number of publications and leading the field, followed by the United Kingdom and China. Owing to factors such as geography and language, European countries tend to cooperate with one another more frequently. Racial differences may affect study results; because current study samples come mainly from Europe, more attention should be paid to other populations in the future.
Analyzing article citation counts can help identify the core research in a field. According to our survey, the top 10 most cited articles are all research articles, published mainly between 2009 and 2013. The most cited article was a study published in Nature Genetics in 2009 [8], which was the first large-scale GWAS in AD, involving over 14,000 participants. This study detected an association between the CLU and PICALM genes and AD risk. Another study, also published in 2009, was conducted by Lambert JC et al. [9]. In addition to the association of CLU with AD, they reported an association of CR1 with AD risk. In 2013, Lambert JC et al. performed a meta-analysis of GWAS data from a study population of European ancestry involving 74,046 individuals. In addition to analyzing known genes, this study identified four new susceptibility loci, providing strong evidence for the importance of APP and tau in AD pathology [40]. The remaining highly cited articles all investigated AD risk genes through GWAS, including MS4A4A, MS4A6A, TREM2, ABCA7, CD33, CD2AP and EPHA1. In addition, we found two studies [12,13] that were published less than three years ago but already have a high number of citations and deserve readers' attention. Jansen IE et al. [12] conducted a large-scale GWAS on AD and proxy AD in individuals of European ancestry; the study was divided into three phases and encompassed over 400,000 participants. Similarly, Kunkle BW et al. [13] conducted a large-scale GWAS on LOAD involving 94,437 individuals. Compared to the previous study, this research primarily focused on non-Hispanic white populations. Both studies identified new risk loci, and their analyses revealed multiple pathogenic factors associated with AD, such as immune response, lipid metabolism, inflammation, Aβ, and tau proteins. It is noteworthy that both studies combined GWAS with meta-analysis to conduct large-scale research on AD patients and non-AD populations. This large-scale approach, with its high statistical power and broad sample representation, supported the stability, accuracy and reliability of the findings. This might be one of the reasons why these two studies have received substantial attention in a short period. Furthermore, both studies discussed the identified risk genes thoroughly and explored the pathogenesis of AD, which not only enhances the understanding of AD but also provides a basis for research on its pathogenesis and targeted treatment.
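The gain from combining GWAS with meta-analysis can be sketched with a fixed-effect inverse-variance model, one common way of pooling per-cohort effect estimates for a single SNP. The function name and the cohort odds ratios and standard errors below are hypothetical and are not taken from the studies discussed above.

```python
import math


def fixed_effect_meta(studies):
    """Pool (log_odds_ratio, standard_error) pairs by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical per-cohort results for one SNP: (log odds ratio, standard error).
cohorts = [
    (math.log(1.15), 0.05),
    (math.log(1.20), 0.04),
    (math.log(1.10), 0.06),
]
pooled, pooled_se, p_value = fixed_effect_meta(cohorts)
print(math.exp(pooled), pooled_se, p_value)
```

The pooled standard error is smaller than that of any single cohort, which is why pooling cohorts can detect associations that individual studies are underpowered to find. A random-effects model would be a reasonable alternative when effects are expected to differ between cohorts.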
Furthermore, the recently published articles were analyzed separately. Recent publications have focused on GWAS and meta-analyses of multiracial populations as a way to validate previous findings and identify new genetic loci. Continued analysis of different races can also increase our understanding of how AD genes vary across populations. In addition, some current research focuses on in-depth studies of specific genes to enhance the understanding of the genetic and pathological mechanisms of AD. Recently, an article exploring the role of CD33 isoforms in microglia and AD attracted researchers' attention. CD33 is one of the high-risk genes associated with AD. It is suggested that CD33 might influence the pathological process of AD by regulating the function of microglia. This article not only reviews CD33-related research but also explores the possibility of targeting CD33 for the treatment of AD.
With the increasing sophistication of GWAS research and computer technology, researchers are incorporating advanced techniques to further optimize GWAS. Yang et al., in their latest publication, introduced Causal Analysis using Regression Model Averaging (CARMA) to optimize fine-mapping in genome-wide meta-analyses, easing the challenge of distinguishing pathogenic from non-pathogenic variants in GWAS [41]. CARMA is a statistical method designed to help researchers gain a better understanding of the true causal relationships between variables. It robustly estimates causal effects in observational data, identifying and addressing potential confounding. LD information in previous studies was usually obtained from external reference panels. However, this may lead to inconsistencies between the LD information and the GWAS summary statistics, resulting in biased fine-mapping results. CARMA, a Bayesian model, not only accounts for the discrepancies between the summary statistics and the LD from reference panels but also integrates the data with functional annotations, which improves the accuracy of GWAS results [41]. In recent years, the use of machine learning (ML) to detect and classify diseases has attracted a lot of attention, and more researchers are beginning to use ML to address complex diseases such as AD [42]. ML, especially deep learning, is considered a powerful tool for GWAS data analysis because of its ability to process large-scale data, automatically extract key features, and identify complex interaction effects between multiple SNPs [43]. One study employed a convolutional neural network model combined with principal component analysis to process MRI data from brain regions. The extracted feature vectors were then used as endophenotypes in GWAS to identify genetic variants linked to AD. This automated feature extraction approach offers more comprehensive information than traditional methods, enhancing the accuracy of predicting genetic risk genes. Furthermore, integrating brain imaging data with GWAS offers new insights into the genetic and biological mechanisms of AD.
In this study, keywords were analyzed to identify the trends and research hotspots in the field. A total of 8,566 keywords and six clusters were obtained. Comprehensive analysis of the keyword and keyword burst results indicated that GWAS in AD mainly focuses on discovering susceptibility genes and studying their effects on AD. More than 100 AD-associated risk loci have been identified in existing studies. In this study, the genes currently attracting the most research attention were identified based on keyword analysis (Table 7). As one of the most significant genetic risk factors for AD, APOE has always been a focus of research [44]. APOE is a protein consisting of 299 amino acids, primarily involved in the transport and metabolism of cholesterol. It is believed to play a vital role in the brain, particularly in lipid transport and damage repair. APOE assists in the transport and metabolism of lipids, such as cholesterol, by binding to lipid molecules and forming lipoprotein particles [20,45]. GWAS have confirmed the APOE ε4 allele as a major genetic risk factor for AD, and it is particularly strongly associated with the risk of LOAD [8,9]. The effect of APOE4 on AD is thought to involve multiple pathological processes. APOE has been shown to promote the metabolism and clearance of Aβ, whereas the APOE4 variant affects the ability of APOE to bind Aβ, thereby disrupting the metabolism and clearance of Aβ, leading to abnormal accumulation and deposition of Aβ and increasing the risk of AD [46]. In addition, APOE4 can worsen tau-mediated neurodegeneration and affect tau pathology [47-49]. APOE4 has also been found to cause metabolic dysregulation in astrocytes and microglia, leading to inflammation, neuronal damage and other AD pathologies [50]. Moreover, APOE4 can impair myelin formation in the brain by interrupting astrocyte-derived lipid transport, thereby disrupting neural signaling and leading to cognitive and motor deficits in AD patients [51]. APOE4 has been found in various ethnic groups, including people of African descent, Caucasians, Hispanics and Asians, with a higher frequency in African and Caucasian populations [52]. The effects of APOE4 in different races are still unclear, and a number of studies have been conducted to explore them. For instance, one study involving African, Caucasian, and Latino populations revealed that APOE4 was associated with poorer performance in episodic memory, particularly among Caucasians [53]. Conversely, another study encompassing Hispanics, African Americans, and non-Hispanic white Americans indicated that the influence of APOE4 on AD did not exhibit significant racial variability [54]. Although APOE4 carriers show a higher prevalence in all races, the extent of racial variability in its impact on AD deserves further study. Researchers are also currently examining whether APOE4 has sex-specific effects on AD. APOE4 has been found to have a more significant effect on cognition and language in women [55]. Female APOE4 carriers experienced a faster decline in memory than male carriers, and APOE4 showed a stronger association with tau pathology in females [56-59].
CLU is another major brain apolipoprotein gene after APOE and is widely expressed in the central nervous system [60]. Like APOE, CLU is synthesized and released by astrocytes and neurons and plays a role in lipid metabolism within the brain [61]. Harold et al. [8] and Lambert et al. [9] discovered through GWAS that this gene is associated with the risk of AD. Studies have shown that the level of CLU increases when the brain is damaged or chronically inflamed. Animal experiments have shown that mice deficient in CLU are more susceptible to Aβ neurotoxicity. It has also been suggested that CLU may be beneficial for the clearance of Aβ in vivo [62,63]. In addition, CLU has been shown to cooperate with APOE in inhibiting Aβ deposition [62]. SNP variation gives rise to several different alleles of the CLU gene, which affect AD to varying degrees. Case-control studies in Chinese populations have demonstrated the effect of CLU on AD susceptibility [64-67]. Notably, the T allele of rs11136000 and the A allele of rs2279590 showed significant protective effects [64]. The T allele of rs11136000 was found to be associated with better cognitive performance in older adults [68,69]. The role of rs11136000 has also been confirmed in meta-analyses involving both Caucasian and Asian populations [65,70,71]. In Caucasian populations, CLU rs9331888 and CLU rs11136000 were found to be associated with AD [8,9]. However, no significant association was observed in studies involving Asian populations [72,73]. This suggests notable racial variation in the impact of CLU on AD, highlighting the need for further large-scale studies.
Lambert et al. [9] also found that the CR1 gene could participate in Aβ clearance together with APOE. The CR1 gene, located on chromosome 1, serves as the primary receptor for complement proteins and can influence AD pathology by regulating complement protein activity. GWAS has revealed that CR1 is strongly associated with LOAD risk. Studies suggest that CR1 primarily affects the clearance of Aβ through its role in the immune system [74]. Various SNPs of CR1, including rs11118322, rs17259045, rs12567945 and rs1323721, have been found to affect Aβ levels in the brain to different extents, among which rs12567945 has an inhibitory effect on Aβ. Additionally, CR1 rs6656401 and rs4844609 were found to be associated with intelligence level, which may affect cognitive function in AD [9,75,76].
CD2AP is a protein-coding gene whose product is thought to be involved in the regulation of receptor-mediated endocytosis and the immune system [77]. GWAS found that CD2AP is associated with tau protein toxicity, thereby affecting the occurrence of AD [13]. This result was also confirmed in a Drosophila AD model [78]. A cohort study found that CD2AP rs9296559 is associated with higher tau levels in cerebrospinal fluid [79]. In addition, different SNPs of the CD2AP gene are believed to affect Aβ levels; for instance, CD2AP rs9349407 was found to cause an increase in plaque load [80]. Animal studies have found that CD2AP also leads to an increase in Aβ and the Aβ42/Aβ40 ratio [81]. CD2AP is also thought to have a potential indirect effect on AD by interacting with other genes or influencing cardiovascular and other risk factors [77]. Numerous studies have been conducted in different races to investigate the effects of CD2AP on AD and their racial variability. However, the results of current studies remain controversial. One study of CD2AP rs9349407 concluded that studies in Asians, Americans, and Europeans may have yielded negative results because of factors such as small sample size. A meta-analysis in that study, which expanded the sample size, showed that CD2AP rs9349407 was associated with AD susceptibility [82]. More large-scale studies should be conducted in the future to better understand the relationship between the CD2AP gene and AD in different ethnic backgrounds.
The CD33 gene encodes a cell surface receptor protein whose primary function involves immune regulation and cell signaling [83]. Multiple SNPs of CD33 have been studied. One GWAS identified a correlation between CD33 and AD, reporting that rs3826656 affects the development of late-onset AD [84]. Subsequently, numerous GWAS have been conducted to determine the association of other CD33 SNPs with susceptibility to AD, including rs3865444, rs12459419, and rs2455069 [83][85][86][87][88]. Different polymorphisms of CD33 are thought to influence the AD pathological process by affecting microglial activation, interfering with microglia-mediated Aβ clearance, and promoting the accumulation of senile plaques [89][90][91].
EPHA1 is a membrane-bound protein that is mainly involved in synaptic development and plasticity [92]. It is also thought to play a role in inflammation and apoptosis [93,94]. A GWAS found that the non-coding form of EPHA1 showed an association with AD [95]. Polymorphisms in EPHA1 may influence the pathogenesis of AD by affecting the production and clearance of Aβ [91]. In addition, the same GWAS reported that the rs11767557 variant of the EPHA1 gene appeared to be more strongly correlated with susceptibility to AD, particularly LOAD, in European populations [95].
PICALM is widely expressed in all cells, with significant expression in neurons. It is considered to have a crucial role in cellular endocytosis, neuronal development, and synaptic plasticity. PICALM could affect the processing of APP through the endocytic pathway, which ultimately results in alterations in Aβ levels [8]. In addition, PICALM is thought to be involved in the regulation of tau proteins and to influence tau pathology. PICALM expression has been found to be dysregulated in the brains of LOAD patients, as evidenced by increased immunoreactivity in microglia and reduced protein levels in microvessels [96,97]. Different loci of PICALM have been studied for AD risk in various races. The effects of different PICALM loci on AD exhibit some racial variability, with a stronger correlation to AD risk observed in Caucasians. This may be because more studies are currently being conducted in Caucasians, and attention to other races should be increased in the future. For example, PICALM rs592297 was associated with AD risk in Caucasians; however, it did not seem to have an effect in Asians [98]. Current research on other races still has limitations, such as small sample sizes, so large-scale studies of these populations should be conducted to improve the accuracy of the conclusions.
ABCA7 belongs to the family of transporter proteins and is predominantly expressed in microglia and neurons [99]. Studies suggest that ABCA7 is mainly involved in cell phagocytosis and lipid regulation [99,100]. ABCA7 is thought to be an important risk gene for LOAD [101]. The exact mechanism by which ABCA7 functions remains unclear, but it may affect AD by regulating the ability to transfer phospholipids to lipoproteins, such as APOE and CLU [102,103]. Additionally, ABCA7 may impact AD by regulating phagocytic activity and APP processing in microglia [100,104]. Studies have also revealed that mutations in the ABCA7 gene lead to increased Aβ production and neuroinflammatory plaques in vivo [80,104]. ABCA7 has been identified as a risk gene for AD across multiple ethnicities, with certain polymorphisms demonstrating notable racial predispositions, such as rs115550680 in African Americans [105]; rs3764648, rs3752229, rs150594667, and rs4147914 in Asians [106]; and rs3764650 in Caucasians [107]. ABCA7 is believed to exert a more potent impact in individuals of African ancestry and is considered to have a stronger effect than APOE in African Americans [105]. Moreover, different alleles of ABCA7 appear to exhibit gender variability in their impact on AD. Research has found that the ABCA7 SNP (rs3764650) is significantly associated with cognitive impairment exclusively in women, while ABCA7 rs3764650 is only related to cognitive dysfunction in men [108]. Such discrepancies may be attributed to factors like sample size and study population, and deserve further in-depth investigation in the future.
TREM2 is a protein primarily expressed on the surface of immune system cells, with the most significant expression in microglia in the brain [109]. TREM2 plays a vital role in the phagocytosis of amyloid plaques by microglia, and mutations in the TREM2 gene may interfere with microglial function, leading to an increased risk of AD [110,111].
The susceptibility loci identified so far affect AD mainly through APP processing, lipid metabolism, endocytosis, Aβ and tau accumulation, and immune regulation. Microglia in the brain are believed to play a crucial role in immune regulation and thus warrant the attention of researchers. It is important to note that this study highlights only a few risk genes that are currently being extensively researched. In recent years, additional candidate loci such as BIN1, SORL1, MS4A, SPI1, and TOMM40 have been identified through GWAS, and these genes also deserve continued attention and further research.
Based on the analysis of GWAS-related articles in this study as well as data from other gene databases, the top 10 biological, molecular, and cellular pathways most related to AD were identified. They are the metabolism of Aβ protein, the phosphorylation and aggregation of tau protein, the neuroinflammatory response, cholesterol metabolism and transport, cell apoptosis and survival pathways, neuronal signal transduction, the oxidative stress response, organization and stability of the cytoskeleton, synthesis and release of neurotransmitters, and neural growth and repair. Aβ and phosphorylated tau are two hallmark pathological manifestations of AD [3]. The metabolism of Aβ encompasses its production, processing, and degradation. Aβ is produced through the aberrant cleavage of APP: when APP is abnormally cleaved by β-secretase and γ-secretase, an Aβ fragment is formed [118]. A variety of genes have been found to be involved in Aβ metabolism. APOE is thought to affect Aβ metabolism in an isoform-specific manner: APOE4 is thought to significantly impair Aβ clearance, while APOE2 only slightly inhibits it [119]. In addition, genes such as BACE1, CR1, and ABCA1 have been found to affect AD by influencing the Aβ metabolic pathway [120,121].
The accumulation of Aβ may interfere with the synthesis and release of neurotransmitters, leading to impaired communication between neurons [122][123][124]. At the same time, nerve injury may also affect the release of nerve growth factor, further affecting neuronal growth and repair. Tau protein is a major component of neurofibrillary tangles, and aberrant phosphorylation of tau may lead to its aggregation into neurofibrillary tangles, affecting AD pathology [125]. In addition, abnormal aggregation of tau proteins may disrupt the neuronal cytoskeleton, leading to morphological and functional changes in neurons. APOE has been found to directly affect tau proteins [49]. Genes such as CR1, SORL1, and ABCA7 have also been found to influence AD by affecting the phosphorylation and aggregation of tau protein [126,127]. Neuroinflammation is thought to play a crucial role in the progression of AD. Accumulation of Aβ may activate microglia and astrocytes, leading to an inflammatory response that accelerates neuronal damage [128]. APOE4 has been found to increase the inflammatory response, while the presence of inflammation exacerbates symptoms in patients with APOE4 [129]. CLU, CR1, TREM2, and CD33 have also been found to be involved in the AD inflammatory response [130,131]. Cholesterol plays a crucial role in the structure and function of neuronal membranes [132]. APOE4, CLU, and ABCA7 were found to be involved in cholesterol metabolism [133]. Abnormal cholesterol metabolism and transport may affect Aβ production and clearance, further accelerating the progression of AD [134]. Apoptosis is a biological mechanism that can lead to AD when it occurs in excess [135]. Meanwhile, apoptosis-induced neuronal death in brain regions can result in cognitive deficits [136]. APOE4 has been found to induce apoptosis and synaptic dysfunction, thus aggravating AD. Additionally, CLU knockout has been observed to cause apoptosis in AD cells [137]. The accumulation of Aβ might interfere with inter-neuronal signaling, leading to cognitive decline. Oxidative stress is considered a key factor leading to neuronal damage in AD [138]. Multiple genes, such as APOE4 and CLU, have been found to be associated with the oxidative stress response in AD [139][140][141].
In addition to susceptibility genes and their biological mechanisms, risk prediction for AD is also an active research topic in the field. Risk prediction research mainly establishes risk assessment models based on genetic risk factors or biomarkers, or through neuroimaging analysis. Establishing a biomarker-based risk assessment model generally involves detecting AD-related biomarkers (such as tau protein and Aβ protein in cerebrospinal fluid) and combining them with age, gender, family history, and other individual factors for a comprehensive evaluation. In addition, because some other neurodegenerative diseases, such as Parkinson's disease and amyotrophic lateral sclerosis, have pathological mechanisms similar to or partially overlapping with those of AD, researchers have also investigated whether the AD-associated genes identified so far are involved in these diseases.
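As an illustration of how a risk assessment model based on genetic risk factors can be built, the following minimal Python sketch computes a weighted sum of risk-allele counts, a common form of polygenic risk score, and maps it to a probability with a logistic function. This is an assumed, simplified formulation rather than a model used in the studies analyzed here; the variant names, weights, intercept, and function names are hypothetical placeholders, not published effect sizes.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages.

    dosages: variant ID -> number of risk alleles carried (0, 1, or 2).
    weights: variant ID -> per-allele effect size (e.g., a log odds ratio).
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise ValueError(f"missing genotype for: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


def logistic_risk(score, intercept=0.0):
    """Map a score on the log-odds scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(intercept + score)))


# Hypothetical weights and genotypes, for illustration only.
weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": -0.25}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(dosages, weights)
risk = logistic_risk(score, intercept=-2.0)
print(f"score = {score:.2f}, illustrative risk = {risk:.3f}")
```

In practice, the weights would be taken from GWAS summary statistics, and covariates such as age, gender, and biomarker levels could enter the same logistic model as additional terms.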
Given the close association between the diagnosis, prevention, and treatment of AD, these keywords are discussed together here, with the aim of providing a holistic understanding of AD and increasing its value in clinical practice. Since there is no ideal drug that can completely cure AD, it is crucial to diagnose AD in a timely and accurate manner. MCI is an early clinical manifestation of AD, and patients at this stage experience mild impairment of memory and behavior. Early detection and diagnosis help patients receive timely treatment, thus slowing the progression of the disease. When AD progresses to an advanced stage, patients experience severe cognitive impairment, eventually leading to dementia. A timely diagnosis of dementia not only helps doctors develop a personalized treatment plan for the patient, but also allows the necessary care and emotional support to be provided to improve the patient's quality of life. LOAD is thought to be influenced by a combination of genes and the environment, and gene-environment interactions (GxE) may accelerate or exacerbate cognitive impairment [142]. GxE has now received attention from researchers, and a series of studies have been carried out to help better prevent AD. For example, humans are mainly exposed to cadmium (Cd), a toxic heavy metal, through diet (e.g., oysters, peanuts, and animal offal) and smoking [143]. Long-term exposure to Cd can damage organs such as the liver and kidneys. Moreover, Cd can cross the blood-brain barrier, leading to inflammation, neuronal apoptosis, and other pathologies that contribute to cognitive decline [144,145]. Additionally, Cd is believed to exacerbate cognitive impairment in the presence of APOE4 [146]. This phenomenon has also been observed in studies of another heavy metal, lead (Pb) [147]. Therefore, it is reasonable to assume that chronic exposure to heavy metals might intensify or accelerate cognitive decline in individuals with APOE4 [146]. Research suggests that the Mediterranean diet helps reduce the risk of AD and dementia, offering an effective approach for the prevention and treatment of AD [148,149]. This effect is particularly pronounced in individuals who do not carry the APOE4 allele [150].
Currently, tools for diagnosing AD include psychological assessment, imaging, and laboratory tests. Biomarkers can reflect the pathophysiological characteristics of AD through the detection of specific substances in biological fluids [151]. These markers can directly show the biological changes of the disease, which is valuable for the early diagnosis of AD and the dynamic monitoring of disease progression. The main biological tests at present are MRI, PET scanning, and cerebrospinal fluid testing [152]. Among these, cerebrospinal fluid testing is the most commonly used method to assess AD progression, primarily by measuring Aβ42 levels or the Aβ42/Aβ40 ratio, as well as total tau and phosphorylated tau levels [152]. In recent years, intensive research on the genetic risk and pathological features of AD has provided new directions for its prevention and treatment. Genes associated with AD identified through GWAS have attracted extensive attention from researchers. These genes play a crucial role in the onset and progression of AD, and therapies targeting them may open new avenues for the treatment of AD.
Based on the results of our analysis, the field is currently developing rapidly, and a large number of studies have been conducted. We note that current research mainly focuses on Caucasians, with relatively few studies conducted in other races, which means that some key genetic variants in other populations may be missed. Due to genetic differences between races, genetic variations commonly found in Caucasians may be rare or non-existent in other races. The rs9331888 polymorphism of the CLU gene was found to affect AD risk in Caucasians, but there was no significant association in East Asian populations [73]. CR1 rs6656401 is thought to affect AD risk in Europeans; however, whether it has an effect in other ethnic groups is controversial [153]. Some scholars have suggested that the negative results for rs6656401 in East Asian populations may be related to small sample sizes, and the association of CR1 rs6656401 with AD was confirmed by meta-analysis [154]. Therefore, we suggest that more GWAS with large samples and multiple races should be conducted in the future to better understand the genetic heterogeneity of AD in different populations. GWAS can identify risk loci associated with AD, which helps to explore the pathogenesis of AD and thus provides a theoretical basis for the next steps in its treatment and prevention. However, there is still a long way to go from genetic studies to clinical translation. Beyond determining the association between genes and AD, research should focus on how these genes function in the pathogenesis of AD. Furthermore, a large number of experimental studies should be conducted to validate the results. In parallel, our findings on risk genes may guide the design of animal models that better reflect the characteristics of human AD.
This study has certain limitations. Due to the limitations of the analysis software, only articles in Web of Science (WOS) were included in this study, and some articles from other databases may have been omitted. In addition, this study was limited to articles in English, and articles published in other languages were not included. Finally, because some studies were published only recently, their citation frequencies are likely to be underestimated, which may affect the research results to some extent.
This study explored GWAS in AD from 2002 to 2022 using CiteSpace and VOSviewer software. A comprehensive summary of the field was provided by analyzing the number of publications, countries/regions and institutions of publication, authors and cited authors, highly cited literature, and research hotspots. This study could help researchers gain insight into the developmental trends and hotspots in this field and further understand the genetic risk factors and mechanisms of AD. Moreover, this study points out the shortcomings of current research and the direction of future research, providing a valuable reference for researchers. Future GWAS in AD should be strengthened in terms of sample size, ethnic diversity, and experimental validation. Continued research into the genetic risk and mechanisms of AD, and the development of targeted therapeutics in response to the findings, are major focuses of future research and urgently need to be addressed.